Results 1 - 20 of 21
1.
Sensors (Basel) ; 24(4)2024 Feb 15.
Article in English | MEDLINE | ID: mdl-38400392

ABSTRACT

In this paper, we present and evaluate a calibration-free mobile eye-tracking system. The system's mobile device consists of three cameras: an IR eye camera, an RGB eye camera, and a front-scene RGB camera. Together, the three cameras build a reliable corneal imaging system that is used to estimate the user's point of gaze continuously and reliably. The system auto-calibrates the device unobtrusively: since the user is not required to follow any special instructions to calibrate the system, they can simply put on the eye tracker and start moving around using it. Deep learning algorithms together with 3D geometric computations were used to auto-calibrate the system per user. Once the model is built, a point-to-point transformation from the eye camera to the front camera is computed automatically by matching corneal and scene images, which allows the gaze point in the scene image to be estimated. The system was evaluated by users in real-life scenarios, indoors and outdoors. The average gaze error was 1.6° indoors and 1.69° outdoors, which is considered very good compared to state-of-the-art approaches.
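The reported gaze errors (1.6° and 1.69°) are angular errors, i.e. the angle between the estimated and true gaze rays. A minimal sketch of that metric (the function name is ours, not the paper's):

```python
import numpy as np

def angular_error_deg(g_est, g_true):
    """Angle in degrees between an estimated and a ground-truth gaze vector."""
    g_est, g_true = np.asarray(g_est, float), np.asarray(g_true, float)
    cos = g_est @ g_true / (np.linalg.norm(g_est) * np.linalg.norm(g_true))
    # Clip guards against arccos domain errors from floating-point round-off.
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))
```

At arm's length (about 1 m), a 1.6° error corresponds to roughly 2.8 cm of displacement in the scene.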


Subjects
Eye Movements , Fixation, Ocular , Eye-Tracking Technology , Cornea/diagnostic imaging , Algorithms
2.
Sci Rep ; 13(1): 14679, 2023 09 06.
Article in English | MEDLINE | ID: mdl-37674052

ABSTRACT

Despite the wide range of uses of rabbits (Oryctolagus cuniculus) as experimental models for pain, as well as their increasing popularity as pets, pain assessment in rabbits is understudied. This study is the first to address automated detection of acute postoperative pain in rabbits. Using a dataset of video footage of n = 28 rabbits before (no pain) and after surgery (pain), we present an AI model for pain recognition that uses both the facial area and the body posture and reaches an accuracy above 87%. We apply a combination of 1-second interval sampling with Grayscale Short-Term stacking (GrayST), to incorporate temporal information for video classification at the frame level, and a frame selection technique to better exploit the availability of video data.
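Grayscale Short-Term stacking (GrayST) packs consecutive grayscale frames into the three channels of an RGB-shaped image, so a standard image classifier can see short-term motion. A minimal sketch under that reading of the abstract (the function name and sliding-window details are ours):

```python
import numpy as np

def grayst_stack(frames):
    """Stack every run of three consecutive H x W grayscale frames into the
    channels of a single H x W x 3 image (Grayscale Short-Term stacking)."""
    clips = []
    for i in range(len(frames) - 2):
        clips.append(np.stack(frames[i:i + 3], axis=-1))  # channels = t, t+1, t+2
    return clips
```

Each stacked image can then be fed to an ordinary 2D CNN, giving frame-level classification with some temporal context.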


Subjects
Communications Media , Deep Learning , Lagomorpha , Animals , Rabbits , Pain, Postoperative , Face
3.
Sci Rep ; 13(1): 8973, 2023 06 02.
Article in English | MEDLINE | ID: mdl-37268666

ABSTRACT

Manual tools for pain assessment from facial expressions have been suggested and validated for several animal species. However, facial expression analysis performed by humans is prone to subjectivity and bias, and in many cases also requires special expertise and training. This has led to an increasing body of work on automated pain recognition, which has been addressed for several species, including cats. Even for experts, cats are a notoriously challenging species for pain assessment. A previous study compared two approaches to automated 'pain'/'no pain' classification from cat facial images: a deep learning approach and an approach based on manually annotated geometric landmarks, which reached comparable accuracy. However, that study used a very homogeneous dataset of cats, so further research is required to study the generalizability of pain recognition in more realistic settings. This study addresses the question of whether AI models can classify 'pain'/'no pain' in cats in a more realistic (multi-breed, multi-sex) setting, using a more heterogeneous and thus potentially 'noisy' dataset of 84 client-owned cats. The cats were a convenience sample presented to the Department of Small Animal Medicine and Surgery of the University of Veterinary Medicine Hannover and included individuals of different breeds, ages, and sexes, with varying medical conditions and histories. Cats were scored by veterinary experts using the Glasgow composite measure pain scale in combination with the well-documented and comprehensive clinical history of those patients; the scoring was then used to train AI models using two different approaches. We show that in this context the landmark-based approach performs better, reaching an accuracy above 77% in pain detection, as opposed to only above 65% for the deep learning approach.
Furthermore, we investigated the explainability of such machine recognition by identifying the facial features that are important to the machine. The region of the nose and mouth appears more important for machine pain classification, while the region of the ears is less important; these findings were consistent across the models and techniques studied here.


Subjects
Face , Pain , Humans , Cats , Animals , Pain/diagnosis , Pain/veterinary , Nose , Facial Expression , Pain Measurement/methods
4.
Sci Rep ; 12(1): 22611, 2022 12 30.
Article in English | MEDLINE | ID: mdl-36585439

ABSTRACT

In animal research, automation of affective state recognition has so far mainly addressed pain in a few species. Emotional states remain uncharted territory, especially in dogs, due to the complexity of their facial morphology and expressions. This study contributes to filling this gap in two respects. First, it is the first to address dog emotional states using a dataset obtained in a controlled experimental setting, including videos from (n = 29) Labrador Retrievers assumed to be in two experimentally induced emotional states: negative (frustration) and positive (anticipation). The dogs' facial expressions were measured using the Dogs Facial Action Coding System (DogFACS). Two different approaches are compared in relation to our aim: (1) a DogFACS-based approach with a two-step pipeline consisting of (i) a DogFACS variable detector and (ii) a positive/negative state Decision Tree classifier; (2) an approach using deep learning techniques with no intermediate representation. The approaches reach accuracies above 71% and 89%, respectively, with the deep learning approach performing better. Second, this study is also the first to study the explainability of AI models in the context of emotion in animals. The DogFACS-based approach provides decision trees, a mathematical representation that reflects previous findings by human experts that certain facial expressions (DogFACS variables) are correlates of specific emotional states. The deep learning approach offers a different, visual form of explainability in the form of heatmaps reflecting the regions of focus of the network's attention, which in some cases are clearly related to the nature of particular DogFACS variables. These heatmaps may hold the key to novel insights into the sensitivity of the network to nuanced pixel patterns reflecting information invisible to the human eye.


Subjects
Facial Recognition , Frustration , Animals , Dogs , Humans , Facial Expression , Emotions , Attention , Recognition, Psychology
5.
Sensors (Basel) ; 22(22)2022 Nov 17.
Article in English | MEDLINE | ID: mdl-36433493

ABSTRACT

RGB and depth cameras are extensively used for the 3D tracking of human pose and motion. Typically, these cameras calculate a set of 3D points representing the human body as a skeletal structure. The tracking capabilities of a single camera are often affected by noise and inaccuracies due to occluded body parts. Multiple-camera setups offer a solution that maximizes coverage of the captured human body and minimizes occlusions. According to best practices, fusing information across multiple cameras requires spatio-temporal calibration. First, the cameras must synchronize their internal clocks; this is typically done by physically connecting the cameras to each other using an external device or cable. Second, the pose of each camera relative to the other cameras must be calculated (extrinsic calibration). State-of-the-art methods use specialized calibration sessions and devices, such as a checkerboard, to perform calibration. In this paper, we introduce an approach to the spatio-temporal calibration of multiple cameras that is designed to run on the fly without specialized devices or equipment, requiring only the motion of the human body in the scene. As an example, the system is implemented and evaluated using Microsoft Azure Kinect. The study shows that the accuracy and robustness of this approach are on par with state-of-the-art practices.
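The extrinsic-calibration step, recovering one camera's pose relative to another from the body seen by both, can be illustrated with the classic Kabsch/Procrustes solution on corresponding 3D skeleton joints. This is a hypothetical sketch, not the paper's pipeline, which also handles temporal synchronization and noisy or occluded joints:

```python
import numpy as np

def kabsch(P, Q):
    """Least-squares rigid transform (R, t) with Q ~= R @ P + t, where P and Q
    are N x 3 arrays of corresponding 3D points (e.g. skeleton joints seen by
    two cameras). Standard Kabsch algorithm via SVD."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cQ - R @ cP
    return R, t
```

Feeding many time-synchronized joint pairs through this solver (ideally inside a robust loop such as RANSAC) yields the relative camera pose without any checkerboard.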


Subjects
Calibration , Humans , Motion (Physics)
6.
Sci Rep ; 12(1): 9575, 2022 06 10.
Article in English | MEDLINE | ID: mdl-35688852

ABSTRACT

Facial expressions in non-human animals are closely linked to their internal affective states, with the majority of empirical work focusing on facial shape changes associated with pain. However, existing tools for facial expression analysis are prone to human subjectivity and bias, and in many cases also require special expertise and training. This paper presents the first comparative study of two different paths towards automating pain recognition in facial images of domestic short-haired cats (n = 29), captured during ovariohysterectomy at different time points corresponding to varying intensities of pain. One approach is based on convolutional neural networks (ResNet50), while the other is based on machine learning models that use geometric landmark analysis inspired by species-specific Facial Action Coding Systems (i.e., CatFACS). Both types of approaches reach a comparable accuracy of above 72%, indicating their potential usefulness as a basis for automating cat pain detection from images.


Subjects
Facial Expression , Facial Recognition , Animals , Cats , Emotions , Face , Humans , Pain/veterinary , Recognition, Psychology
7.
Sensors (Basel) ; 22(4)2022 Feb 17.
Article in English | MEDLINE | ID: mdl-35214471

ABSTRACT

Automating fall risk assessment in an efficient, non-invasive manner, particularly for the elderly population, enables wide screening of individuals for fall risk and determination of their need to participate in fall prevention programs. We present an automated and efficient system for fall risk assessment based on a multi-depth-camera human motion tracking system, which captures patients performing the well-known and validated Berg Balance Scale (BBS). Trained machine learning classifiers predict the patient's 14 BBS scores by extracting spatio-temporal features from the captured human motion records. Additionally, we used machine learning tools to develop fall risk predictors that reduce the number of BBS tasks required to assess fall risk from 14 to 4-6, without compromising the quality and accuracy of the BBS assessment. The reduced battery, termed Efficient-BBS (E-BBS), can be performed by physiotherapists in a traditional setting or deployed using our automated system, allowing an efficient and effective BBS evaluation. We report on a pilot study, run in a major hospital, including accuracy and statistical evaluations. We show the accuracy and confidence levels of the E-BBS, as well as the average number of BBS tasks required to reach the accuracy thresholds. The trained E-BBS system reduces the number of tasks in the BBS test by approximately 50% while maintaining 97% accuracy. The presented approach enables wide screening of individuals for fall risk in a manner that does not require significant time or resources from the medical community. Furthermore, the technology and machine learning algorithms can be applied to other batteries of medical tests and evaluations.
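One plausible way to shrink a 14-task battery to 4-6 tasks is greedy forward selection of the task subset whose scores best predict the full-battery score. This is an illustrative sketch only; the abstract does not specify the actual E-BBS reduction procedure:

```python
import numpy as np

def greedy_select(X, y, k):
    """Greedy forward selection of k tasks (columns of X, shaped patients x tasks)
    whose least-squares fit best predicts the full-battery score y.
    Hypothetical sketch -- not the paper's E-BBS method."""
    chosen, remaining = [], list(range(X.shape[1]))
    for _ in range(k):
        best, best_err = None, np.inf
        for j in remaining:
            A = X[:, chosen + [j]]                      # candidate task subset
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            err = np.linalg.norm(A @ coef - y)          # residual of the fit
            if err < best_err:
                best, best_err = j, err
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

In practice the stopping point (4, 5, or 6 tasks) would be chosen by an accuracy threshold on held-out patients rather than a fixed k.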


Subjects
Accidental Falls , Postural Balance , Accidental Falls/prevention & control , Aged , Humans , Machine Learning , Pilot Projects , Risk Assessment
8.
Sensors (Basel) ; 23(1)2022 Dec 29.
Article in English | MEDLINE | ID: mdl-36616978

ABSTRACT

In this paper, we present a framework for 3D gaze estimation intended to identify the user's focus of attention in a corneal imaging system. The framework uses a headset that consists of three cameras: a scene camera and two eye cameras, one IR and one RGB. The IR camera is used to continuously and reliably track the pupil, and the RGB camera is used to acquire corneal images of the same eye. Deep learning algorithms are trained to detect the pupil in the IR and RGB images and to compute a per-user 3D model of the eye in real time. Once the 3D model is built, the 3D gaze direction is computed starting from the eyeball center and passing through the pupil center to the outside world. The model can also be used to transform the pupil position detected in the IR image into its corresponding position in the RGB image and to detect the gaze direction in the corneal image. This technique circumvents the problem of pupil detection in RGB images, which is especially difficult and unreliable when the scene is reflected in the corneal images. In our approach, the auto-calibration process is transparent and unobtrusive: users do not have to be instructed to look at specific objects to calibrate the eye tracker; they need only act and gaze normally. The framework was evaluated in a user study in realistic settings and the results are promising: it achieved a very low 3D gaze error (2.12°) and very high accuracy in acquiring corneal images (intersection over union, IoU = 0.71). The framework may be used in a variety of real-world mobile scenarios (indoors, indoors near windows, and outdoors) with high accuracy.
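The two headline quantities can be made concrete in a few lines: the gaze ray from the eyeball center through the pupil center, and intersection over union for the acquired corneal image regions (function names are ours, and axis-aligned boxes stand in for whatever region representation the paper uses):

```python
import numpy as np

def gaze_direction(eyeball_center, pupil_center):
    """Unit 3D gaze direction: a ray from the eyeball center through the pupil center."""
    v = np.asarray(pupil_center, float) - np.asarray(eyeball_center, float)
    return v / np.linalg.norm(v)

def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes given as (x0, y0, x1, y1)."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])  # intersection corners
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x1 - x0) * max(0.0, y1 - y0)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union
```

An IoU of 0.71 thus means the predicted and true corneal regions overlap by 71% of their combined area.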


Subjects
Eye Movements , Fixation, Ocular , Algorithms , Pupil , Cornea
9.
J Imaging ; 9(1)2022 Dec 30.
Article in English | MEDLINE | ID: mdl-36662107

ABSTRACT

Computer vision and robotics are increasingly involved in cultural heritage [...].

10.
Gastrointest Endosc ; 94(6): 1099-1109.e10, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34216598

ABSTRACT

BACKGROUND AND AIMS: Colorectal cancer is a leading cause of death. Colonoscopy is the criterion standard for detection and removal of precancerous lesions and has been shown to reduce mortality. The polyp miss rate during colonoscopies is 22% to 28%. DEEP DEtection of Elusive Polyps (DEEP2) is a new polyp detection system based on deep learning that alerts the operator in real time to the presence and location of polyps. The primary outcome was the performance of DEEP2 on the detection of elusive polyps. METHODS: The DEEP2 system was trained on 3611 hours of colonoscopy videos derived from 2 sources and was validated on a set comprising 1393 hours from a third unrelated source. Ground truth labeling was provided by offline gastroenterologist annotators who were able to watch the video in slow motion and pause and rewind as required. To assess applicability, stability, and user experience and to obtain some preliminary data on performance in a real-life scenario, a preliminary prospective clinical validation study was performed comprising 100 procedures. RESULTS: DEEP2 achieved a sensitivity of 97.1% at 4.6 false alarms per video for all polyps and of 88.5% and 84.9% for polyps in the field of view for less than 5 and 2 seconds, respectively. DEEP2 was able to detect polyps not seen by live real-time endoscopists or offline annotators in an average of .22 polyps per sequence. In the clinical validation study, the system detected an average of .89 additional polyps per procedure. No adverse events occurred. CONCLUSIONS: DEEP2 has a high sensitivity for polyp detection and was effective in increasing the detection of polyps both in colonoscopy videos and in real procedures with a low number of false alarms. (Clinical trial registration number: NCT04693078.).


Subjects
Adenomatous Polyps , Colonic Polyps , Colorectal Neoplasms , Artificial Intelligence , Colonic Polyps/diagnosis , Colonoscopy , Colorectal Neoplasms/diagnosis , Humans , Prospective Studies
11.
IEEE Trans Med Imaging ; 39(11): 3451-3462, 2020 11.
Article in English | MEDLINE | ID: mdl-32746092

ABSTRACT

Colonoscopy is the tool of choice for preventing colorectal cancer by detecting and removing polyps before they become cancerous. However, colonoscopy is hampered by the fact that endoscopists routinely miss 22-28% of polyps. While some of these missed polyps appear in the endoscopist's field of view, others are missed simply because of substandard coverage of the procedure, i.e., not all of the colon is seen. This paper attempts to rectify the problem of substandard coverage in colonoscopy through the introduction of the C2D2 (Colonoscopy Coverage Deficiency via Depth) algorithm, which detects deficient coverage and can thereby alert the endoscopist to revisit a given area. More specifically, C2D2 consists of two separate algorithms: the first performs depth estimation of the colon given an ordinary RGB video stream, while the second computes coverage given these depth estimates. Rather than compute coverage for the entire colon, our algorithm computes coverage locally, on a segment-by-segment basis; C2D2 can then indicate in real time whether a particular area of the colon has suffered from deficient coverage, and if so the endoscopist can return to that area. Our coverage algorithm is the first such algorithm to be evaluated at large scale, and our depth estimation technique is the first calibration-free unsupervised method applied to colonoscopies. The C2D2 algorithm achieves state-of-the-art results in the detection of deficient coverage: on synthetic sequences with ground truth, it is 2.4 times more accurate than human experts, while on real sequences it achieves 93.0% agreement with experts.


Subjects
Colonic Neoplasms , Colonic Polyps , Algorithms , Colonic Polyps/diagnostic imaging , Colonoscopy , Humans
12.
IEEE Trans Pattern Anal Mach Intell ; 41(12): 2846-2860, 2019 Dec.
Article in English | MEDLINE | ID: mdl-30207949

ABSTRACT

Correctly matching feature points in a pair of images is an important preprocessing step for many computer vision applications. In this paper we propose an efficient method for estimating the number of correct matches without explicitly computing them. To this end, we analyze the set of matches using the spatial order of the features, as projected onto the x-axis of the image. The set of features in each image is thus represented by a sequence and analyzed using the Kendall and Spearman Footrule distance metrics between permutations. This result is interesting in its own right. Moreover, we demonstrate three useful applications of our method: (i) a new halting condition for RANSAC-based epipolar geometry estimation methods, (ii) discarding spatially unrelated image pairs in the Structure-from-Motion pipeline, and (iii) computing the probability that a given match is correct based on the rank of the features within the sequences. Our experiments on a large amount of synthetic and real data demonstrate the effectiveness of our method. For example, the running time of the image matching stage in the Structure-from-Motion pipeline may be reduced by about 90 percent while preserving about 85 percent of the image pairs with spatial overlap.
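The two permutation metrics named in the abstract are standard: the Spearman Footrule sums positional displacements, and the Kendall distance counts discordant pairs. A minimal sketch, measuring distance from the identity permutation (i.e. how far the x-axis order of features in one image departs from the order in the other):

```python
import numpy as np

def spearman_footrule(perm):
    """Spearman Footrule distance from the identity: sum_i |perm[i] - i|."""
    perm = np.asarray(perm)
    return int(np.abs(perm - np.arange(len(perm))).sum())

def kendall_distance(perm):
    """Kendall distance from the identity: the number of discordant pairs,
    i.e. pairs (i, j) with i < j but perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
```

A mostly correct match set yields a near-identity permutation and small distances; random matches push both distances toward their maxima, which is what makes them usable as a proxy for the inlier count.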

13.
IEEE Trans Pattern Anal Mach Intell ; 39(2): 411-416, 2017 02.
Article in English | MEDLINE | ID: mdl-27019475

ABSTRACT

Sparse and redundant representations, where signals are modeled as a combination of a few atoms from an overcomplete dictionary, are increasingly used in many image processing applications, such as denoising, super-resolution, and classification. One common problem is learning a "good" dictionary for different tasks. In the classification task, the aim is to learn a dictionary that also takes training labels into account, and indeed several approaches to this problem exist. One well-known technique is D-KSVD, which jointly learns a dictionary and a linear classifier using the K-SVD algorithm. LC-KSVD is a recent variation intended to further improve on this idea by adding an explicit label-consistency term to the optimization problem, so that different classes are represented by different dictionary atoms. In this work we prove that, under identical initialization conditions, LC-KSVD with uniform atom allocation is in fact a reformulation of D-KSVD: given the regularization parameters of LC-KSVD, we give a closed-form expression for the equivalent D-KSVD regularization parameter, assuming the LC-KSVD initialization scheme is used. We confirm this by reproducing several of the original LC-KSVD experiments.

14.
Comput Med Imaging Graph ; 43: 150-64, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25804442

ABSTRACT

In this paper, we introduce a novel method for the detection and segmentation of crypts in colon biopsies. Most approaches proposed in the literature try to segment the crypts using only the biopsy image, without understanding the meaning of each pixel. The proposed method differs in that we segment the crypts using an automatically generated pixel-level classification of the original biopsy image, and handle the artifacts due to the sectioning process and the variance in color, shape, and size of the crypts. The biopsy image pixels are classified into nuclei, immune system, lumen, cytoplasm, stroma, and goblet cells. The crypts are then segmented using a novel active contour approach, in which the external force is determined by the semantics of each pixel and the model of the crypt. The active contour is applied to every lumen candidate detected by the pixel-level classification. Finally, a false-positive crypt elimination process removes segmentation errors by measuring the segments' adherence to the crypt model using the pixel-level classification results. The method was tested on 54 biopsy images containing 4944 healthy and 2236 cancerous crypts, resulting in detection of 87% of the crypts with 9% false-positive segments (segments that do not represent a crypt). The segmentation accuracy on the true-positive segments is 96%.


Subjects
Colonic Neoplasms/pathology , Histological Techniques , Image Processing, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Biopsy , Color , Humans , Predictive Value of Tests , Staining and Labeling
15.
IEEE Trans Pattern Anal Mach Intell ; 36(12): 2381-95, 2014 Dec.
Article in English | MEDLINE | ID: mdl-26353146

ABSTRACT

Algorithms for the estimation of epipolar geometry from a pair of images have been very successful in dealing with challenging wide-baseline images. In this paper, the problem of scenes with repeated structures is addressed, dealing with the common case where the overlap between the images consists mainly of the facades of a building. These facades may contain many repeated structures that cannot be matched locally, causing state-of-the-art algorithms to fail. Assuming that the repeated structures lie on a planar surface in an ordered fashion, the goal is to match them. Our algorithm first rectifies the images so that the facade is fronto-parallel. It then clusters similar features in each of the two images and matches the clusters. From these, a set of hypothesized homographies of the facade is generated using local groups of features. For each homography the epipole is recovered, yielding a fundamental matrix. For the best solution, the algorithm then decides whether the fundamental matrix has been recovered reliably and, if not, returns only the homography. The algorithm has been tested on a large number of challenging image pairs of buildings from the benchmark ZuBuD database, outperforming several state-of-the-art algorithms.

16.
IEEE Trans Pattern Anal Mach Intell ; 34(12): 2327-40, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22331857

ABSTRACT

It is quite common that multiple human observers attend to a single static interest point. This is known as a mutual awareness event (MAWE). A preferred way to monitor these situations is with a camera that captures the human observers while using existing face detection and head pose estimation algorithms. The current work studies the underlying geometric constraints of MAWEs and reformulates them in terms of image measurements. The constraints are then used in a method that 1) detects whether such an interest point does exist, 2) determines where it is located, 3) identifies who was attending to it, and 4) reports where and when each observer was while attending to it. The method is also applied on another interesting event when a single moving human observer fixates on a single static interest point. The method can deal with the general case of an uncalibrated camera in a general environment. This is in contrast to other work on similar problems that inherently assumes a known environment or a calibrated camera. The method was tested on about 75 images from various scenes and robustly detects MAWEs and estimates their related attributes. Most of the images were found by searching the Internet.


Subjects
Algorithms , Awareness/physiology , Image Processing, Computer-Assisted/methods , Posture/physiology , Signal Processing, Computer-Assisted , Social Behavior , Bayes Theorem , Databases, Factual , Face/anatomy & histology , Humans , Imaging, Three-Dimensional/methods , Mass Behavior
17.
IEEE Trans Pattern Anal Mach Intell ; 33(2): 325-37, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21193810

ABSTRACT

In this work, we recover the 3D shape of mirrors, sunglasses, and stainless steel implements. A computer monitor displays several images of parallel stripes, each image at a different angle. Reflections of these stripes in a mirroring surface are captured by the camera. For every image point, the direction of the displayed stripes and their reflections in the image are related by a 1D homography matrix, computed with a robust version of the statistically accurate heteroscedastic approach. By focusing on a sparse set of image points for which monitor-image correspondence is computed, the depth and the local shape may be estimated from these homographies. The depth estimation relies on statistically correct minimization and provides accurate, reliable results. Even for the image points where the depth estimation process is inherently unstable, we are able to characterize this instability and develop an algorithm to detect and correct it. After correcting the instability, dense surface recovery of mirroring objects is performed using constrained interpolation, which does not simply interpolate the surface depth values but also uses the locally computed 1D homographies to solve for the depth, the correspondence, and the local surface shape. The method was implemented and the shape of several objects was densely recovered at submillimeter accuracy.

18.
IEEE Trans Pattern Anal Mach Intell ; 31(7): 1310-24, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19443927

ABSTRACT

We present a new method for recovering the 3D shape of a featureless smooth surface from three or more calibrated images illuminated by different light sources (at least three of which are independent). This method is unique in its ability to handle images taken from unconstrained perspective viewpoints and unconstrained illumination directions. The correspondence between such images is hard to compute, and no other known method can handle this problem locally from a small number of images. Our method combines geometric and photometric information in order to recover dense correspondence between the images and accurately compute the 3D shape. Only a single pass starting at one point and local computation are used. This is in contrast to methods that use the occluding contours recovered from many images to initialize and constrain an optimization process; the output of our method can be used to initialize such processes. In the special case of a fixed viewpoint, the proposed method becomes a new perspective photometric stereo algorithm. Nevertheless, with the introduction of the multiview setup, self-occlusions and regions close to the occluding boundaries are better handled, and the method is more robust to noise than photometric stereo. Experimental results are presented for simulated and real images.


Subjects
Algorithms , Artificial Intelligence , Face/anatomy & histology , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Subtraction Technique , Humans , Reproducibility of Results , Sensitivity and Specificity
19.
IEEE Trans Pattern Anal Mach Intell ; 30(7): 1230-42, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18550905

ABSTRACT

The estimation of the epipolar geometry is especially difficult when the putative correspondences include a low percentage of inlier correspondences and/or a large subset of the inliers is consistent with a degenerate configuration of the epipolar geometry that is totally incorrect. This work presents the Balanced Exploration and Exploitation Model Search (BEEM) algorithm, which works especially well for these difficult scenes. The algorithm handles these two problems in a unified manner and includes the following main features: (1) balanced use of three search techniques: global random exploration, local exploration near the current best solution, and local exploitation to improve the quality of the model; (2) exploitation of available prior information to accelerate the search process; (3) use of the best model found to guide the search process, escape from degenerate models, and define an efficient stopping criterion; (4) a simple and efficient method to estimate the epipolar geometry from two SIFT correspondences; (5) use of the locality-sensitive hashing (LSH) approximate nearest neighbor algorithm for fast generation of putative correspondences. When tested on real images, with or without degenerate configurations, the resulting algorithm gives quality estimations and achieves significant speedups compared to state-of-the-art algorithms.


Subjects
Algorithms , Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Pattern Recognition, Automated/methods , Computer Simulation , Image Enhancement/methods , Models, Statistical , Reproducibility of Results , Sensitivity and Specificity
20.
IEEE Trans Syst Man Cybern B Cybern ; 38(3): 826-45, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18558545

ABSTRACT

We present a method for the recovery of partially occluded 3-D geometric primitives from range images that may also include non-primitive objects. The method uses a technique for estimating the principal curvatures and Darboux frame from range images. After estimating the principal curvatures and the Darboux frames over the entire scene, a search for the known patterns of these features in geometric primitives is performed. If a specific pattern is identified, the presence of the corresponding primitive is confirmed using these local features, which are also used to recover the primitive's characteristics. The suggested approach is very efficient since it combines the segmentation, classification, and fitting stages found in any recovery process into a single process that advances monotonically through the recovery procedure. We view the problem as a robust statistics problem and therefore use techniques from that field. A mean-shift-based algorithm is used for the robust estimation of shape parameters, such as recognizing which types of shapes exist in the scene and, after that, the full recovery of planes, spheres, and cylinders. A random-sample-consensus-based algorithm is used for robust model estimation of the more complex primitives, such as cones and tori. These algorithms yield a set of proposed primitives, which contains superfluous models that cannot be eliminated at this stage. To deal with this problem, a minimum-description-length method has been developed, which selects the subset of models that best describes the scene. The method has been tested on a series of real, complex, cluttered scenes, yielding accurate and robust recoveries of primitives.


Subjects
Algorithms , Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Information Storage and Retrieval/methods , Pattern Recognition, Automated/methods , Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity